Research Products Catalogue

Neural networks are now widely used as artificial intelligence models for learning tasks. Since neural networks typically process very large amounts of data, it is convenient to formulate them within mean-field and kinetic theory. In this work we focus on a particular class of neural networks, namely residual neural networks, assuming that each layer has the same number of neurons N, which is fixed by the dimension of the data. This assumption allows us to interpret the residual neural network as a time-discretized ordinary differential equation, in analogy with neural differential equations. The mean-field description is then obtained in the limit of infinitely many input data. This leads to a Vlasov-type partial differential equation that describes the evolution of the distribution of the input data. We analyze steady states and sensitivity with respect to the parameters of the network, namely the weights and the bias. In the simple setting of a linear activation function and one-dimensional input data, the study of the moments provides insight into the choice of the network parameters. Furthermore, a modification of the microscopic dynamics, inspired by stochastic residual neural networks, leads to a Fokker-Planck formulation of the network, in which network training is replaced by the task of fitting distributions. The analysis is validated by artificial numerical simulations. In particular, results on classification and regression problems are presented.
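As an illustration of the simple setting mentioned in the abstract (linear activation, one-dimensional data), the following is a minimal sketch and not the authors' implementation. It assumes the residual update takes the form x <- x + dt * activation(w * x + b), with scalar weight w and bias b held constant across layers; under that assumption the mean of the data follows dm/dt = w * m + b in the continuous limit. The function names `resnet_forward` and `first_moment_exact` and all parameter values are hypothetical.

```python
import numpy as np

def resnet_forward(x0, w, b, dt, n_layers, activation=lambda z: z):
    """Propagate samples through n_layers residual layers.

    Assumed update (forward Euler step of dx/dt = activation(w*x + b)):
        x <- x + dt * activation(w*x + b)
    The same scalar w and b are used in every layer (an assumption).
    """
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(n_layers):
        x = x + dt * activation(w * x + b)
    return x

def first_moment_exact(m0, w, b, t):
    """Mean at time t for dx/dt = w*x + b with constant w != 0 and b."""
    return (m0 + b / w) * np.exp(w * t) - b / w

# Hypothetical input data: 10,000 one-dimensional samples.
rng = np.random.default_rng(0)
x0 = rng.normal(loc=1.0, scale=0.5, size=10_000)

# Hypothetical parameters: w < 0 drives the mean toward -b/w.
w, b, T, n_layers = -1.0, 0.5, 2.0, 200

xT = resnet_forward(x0, w, b, T / n_layers, n_layers)
m_exact = first_moment_exact(x0.mean(), w, b, T)
steady_mean = -b / w  # fixed point of dm/dt = w*m + b
```

With these values the empirical mean `xT.mean()` should stay close to `m_exact`, and both move toward `steady_mean` as T grows, which is one way the sign of w and the ratio b/w shape the output distribution in this toy setting.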

MEAN-FIELD AND KINETIC DESCRIPTIONS OF NEURAL DIFFERENTIAL EQUATIONS / Herty, M.; Trimborn, T.; Visconti, G. - In: FOUNDATIONS OF DATA SCIENCE. - ISSN 2639-8001. - 4:2(2022), pp. 271-298. [10.3934/fods.2022007]

MEAN-FIELD AND KINETIC DESCRIPTIONS OF NEURAL DIFFERENTIAL EQUATIONS

Herty M.;Trimborn T.;Visconti G.

2022



	Publication year
	
			2022
		
	Keywords
	
			continuous limit; kinetic equation; machine learning; mean-field equation; residual neural network
		
	Type
	
			01 Journal publication::01a Journal article
		

Files attached to this item

File	Size	Format
Herty_Mean-field_2022.pdf (archive managers only; Type: publisher's version, published with the publisher's layout; License: All rights reserved)	1.8 MB	Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11573/1659731
